{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c6cb6fd3-0e03-4a7e-82dd-b0104c6400ee",
   "metadata": {},
   "source": [
    "# Using hvPlot as a Pandas user"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "72ee8178-a268-4cc3-b35f-ed2979fa9536",
   "metadata": {},
   "source": [
    "## Prerequisites\n",
    "\n",
    "| What? | Why? |\n",
    "| --- | --- |\n",
    "| [Getting Started](./getting_started.ipynb) | For discovering hvPlot's basics |"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "475dd07c-2526-424d-b03d-5ccf5cec00fb",
   "metadata": {},
   "source": [
    "## Overview"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bba9bde5-8530-4431-917a-0db9bfb6a0de",
   "metadata": {},
   "source": [
    "Pandas [`.plot()` API](https://pandas.pydata.org/docs/user_guide/visualization.html) has slowly emerged as a de-facto standard for high-level plotting APIs in Python, providing a simple interface to generate plots directly from a DataFrame or Series object. Many libraries have implemented this interface, providing additional and varied capabilities to Python users (e.g. [Plotly Express](https://plotly.com/python/pandas-backend) to generate Plotly plots, [Xarray](https://docs.xarray.dev/en/stable/user-guide/plotting.html) to generate plots from multidimensional datasets). hvPlot is one of these libraries that provides an interface heavily inspired by Pandas', and that extends it in many ways.\n",
    "\n",
    "Before diving more into this tutorial, let's first list what reasons, as a Pandas user, may lead you to add hvPlot to your toolbox:\n",
    "\n",
    "- By default hvPlot uses [Bokeh](https://bokeh.org/) to generate *interactive* plots (zoom, pan, hover, etc.) that are very well suited for data exploration workflows, contrary to the *static* Matplotlib plots Pandas generates.\n",
    "- hvPlot doesn't only replicate Pandas `.plot` API but extends it in many ways, exposing many of the powerful features offered by [HoloViews](https://holoviews.org/) like handling of very large datasets (with [Datashader](https://datashader.org/), geographic mapping with [GeoViews](https://geoviews.org/)), automatic drill-down with widgets, easy overlay and layout, and more.\n",
    "- hvPlot supports different objects from [other data libraries](../ref/data_libraries.ipynb) (Polars, Dask, GeoPandas, Xarray, etc.), allowing you to visually explore your non-Pandas datasets with the same plotting API.\n",
    "\n",
    "In this tutorial, we will show how to get started with hvPlot as a Pandas user, focusing on some of the basic differences between the two plotting APIs to ease your transition.\n",
    "\n",
    "## A familiar API\n",
    "\n",
    "As already mentioned, hvPlot's design has been heavily inspired by Pandas' plotting interface. This, however, doesn't mean both APIs are fully compatible; being 100% compatible is in fact a non-goal for hvPlot. To explain why, let's see how the two interfaces are designed.\n",
    "\n",
    "Pandas' `plot()` method is a convenient interface to Matplotlib:\n",
    "\n",
    ":::{mermaid}\n",
    "graph LR\n",
    "    Pandas --> Matplotlib\n",
    ":::\n",
    "\n",
    "On the other hand, hvPlot is a convenient interface to [HoloViews](https://holoviews.org), itself being an interface to plotting libraries like Bokeh, Matplotlib and Plotly:\n",
    "\n",
    ":::{mermaid}\n",
    "graph LR\n",
    "    Pandas --> hvPlot\n",
    "    hvPlot --> HoloViews\n",
    "    HoloViews --> Bokeh\n",
    "    HoloViews --> Matplotlib\n",
    "    HoloViews --> Plotly    \n",
    ":::\n",
    "\n",
    "While Matplotlib and HoloViews are both visualization libraries, they are quite different in their nature, the former being a pure plotting tool (i.e. it knows how to draw pixels onto your screen) and the latter being more of a data exploration tool. These differences explain some of the differences you will observe between Pandas (more influenced by Matplotlib) and hvPlot (more influenced by HoloViews).\n",
    "\n",
    "Yet, even if you will find some differences, as a Pandas user you should feel a great deal of **familiarity** when using hvPlot!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7967388c",
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "835ec277",
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "df = hvplot.sampledata.penguins('pandas')\n",
    "df.head(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0cb7bc6b-95f0-4fb0-8fe6-2934fb75478b",
   "metadata": {},
   "source": [
    "## Quick way to try it out: switch Pandas backend to hvPlot\n",
    "\n",
    "Pandas lets its users switch the plotting backend (default is Matplotlib) to a third-party library that implements the plotting interface. This means that you can keep your code as is, add one line to switch the plotting backend, and see how the plot looks like with another backend. Let's emulate this by first generating a standard Pandas plot:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7ed58e0a",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.plot.scatter('bill_length_mm', 'bill_depth_mm');"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1cb4b1d0",
   "metadata": {},
   "source": [
    "We switch the backend to hvPlot:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b03f9fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "pd.options.plotting.backend = 'hvplot'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "067206d7",
   "metadata": {},
   "source": [
    "From now on `.plot()` calls are going to leverage hvPlot to generate Bokeh plots:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e2553b0",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.plot.scatter('bill_length_mm', 'bill_depth_mm')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e3fa52d0",
   "metadata": {},
   "source": [
    "## Register `.hvplot()` on Pandas objects"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7a2d8421",
   "metadata": {},
   "source": [
    "Switching Pandas plotting backend from Matplotlib to hvPlot like we just showed is an easy way to try out hvPlot but is not the long-term approach we recommend, for two main reasons:\n",
    "\n",
    "- This possibility is only offered by Pandas that provides a special entry-point for third-party libraries to register themselves as a plotting backend, but hvPlot supports many other data libraries (Dask, Polars, GeoPandas, Xarray, etc.).\n",
    "- As already mentioned, Pandas' plotting interface and hvPlot are not 100% compatible, and we find it better to be more explicit about the plotting library used.\n",
    "\n",
    "The general mechanism to register the `hvplot` attribute on data objects is via a special import `hvplot.<library>`. Once executed, the objects of that library supported by hvPlot are equipped with the `hvplot` accessor. Let's try this out with Pandas."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "19cc3051",
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cb3f188b",
   "metadata": {},
   "source": [
    "After this import, the `hvplot` accessor is now available on `DataFrame` and `Series` objects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3bff3aa3",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.hvplot.scatter('bill_length_mm', 'bill_depth_mm')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d01ff4e5",
   "metadata": {},
   "outputs": [],
   "source": [
    "df['bill_length_mm'].hvplot.hist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5198dc91",
   "metadata": {},
   "outputs": [],
   "source": [
    "pd.options.plotting.backend = 'matplotlib'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94aa54b4",
   "metadata": {},
   "source": [
    "## `.hvplot()` returns HoloViews objects"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "615e531b",
   "metadata": {},
   "source": [
    "A `plot` call in Pandas returns a Matplotlib `Axes` object. This object can be passed to Pandas' `plot` API via the `ax` argument, for example to overlay two different plots (the `ax` argument is not supported in hvPlot)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2407f926",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot = df.plot.scatter('bill_length_mm', 'bill_depth_mm', figsize=(4, 3))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "95364079",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(plot)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c7e38569",
   "metadata": {},
   "source": [
    "hvPlot's plotting API returns HoloViews objects. These objects are wrappers around the original dataset, whose rich representation is a plot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7f95741e",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot = df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', hover_cols=['species'])\n",
    "print(plot)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2da9c8b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b21897f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot.data.head(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "351ff41f",
   "metadata": {},
   "source": [
    "Using HoloViews' API, this object can be further customized."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "00ca79da",
   "metadata": {},
   "outputs": [],
   "source": [
    "import holoviews as hv\n",
    "\n",
    "plot.opts(\n",
    "    height=300, width=300, color=hv.dim('species'),\n",
    "    cmap='Category10', show_legend=False,\n",
    ").hist(['bill_length_mm','bill_depth_mm'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bf4ceda2",
   "metadata": {},
   "source": [
    "## Overlays and layouts\n",
    "\n",
    "In Pandas, overlays are usually created by passing down an `Axes` object to another `plot` call via the `ax` argument. Layouts are created by setting `subplots=True`, and can be customized further with the `layout` argument, or with Matplotlib's API.\n",
    "\n",
    "The approach is quite different in hvPlot as HoloViews offers some very convenient API with `*` for overlaying plots and `+` for laying out plots. Together with the `subplots` argument and HoloViews' `.cols(N)` method to limit the number `N` of plots per row, this forms an API flexible enough to handle most situations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7113d9a5",
   "metadata": {},
   "outputs": [],
   "source": [
    "df1 = df.query('species == \"Adelie\"')\n",
    "df2 = df.query('species == \"Gentoo\"')\n",
    "ax = df1.plot.scatter('bill_length_mm', 'bill_depth_mm', color=\"blue\", label=\"Adelie\")\n",
    "df2.plot.scatter('bill_length_mm', 'bill_depth_mm', color=\"green\", label=\"Gentoo\", ax=ax);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b59ac5c3",
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    df1.hvplot.scatter('bill_length_mm', 'bill_depth_mm', color=\"blue\", label=\"Adelie\")\n",
    "    * df2.hvplot.scatter('bill_length_mm', 'bill_depth_mm', color=\"green\", label=\"Gentoo\")\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4a7d3270",
   "metadata": {},
   "outputs": [],
   "source": [
    "dft = pd.DataFrame(np.random.randn(1000, 4), columns=list(\"ABCD\")).cumsum()\n",
    "dft.plot.line(subplots=True, layout=(2, 3), figsize=(8, 6));"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2b9a6f4e",
   "metadata": {},
   "outputs": [],
   "source": [
    "dft.hvplot.line(subplots=True, width=220).cols(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "17eb1b19",
   "metadata": {},
   "outputs": [],
   "source": [
    "dft['A'].hvplot.line(width=220) + dft['B'].hvplot.line(width=220)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "115202e2",
   "metadata": {},
   "source": [
    "## Setting plot dimensions\n",
    "\n",
    "Setting plot dimensions in Pandas is done with the `figsize` argument that accepts a tuple *(width, height)* in *inches*. `figsize` is not supported in hvPlot, instead, plot dimensions are set with the `width` (default is `700`) and `height` (default is `700`) arguments that accept integer values in pixels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cfc1596b",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.plot.scatter('bill_length_mm', 'bill_depth_mm', figsize=(4, 3));"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "74d047c2",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', width=350, height=250)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "123e49c4-2944-4d8a-a529-d4ac019e3baf",
   "metadata": {},
   "source": [
    "## Widgets-based exploration\n",
    "\n",
    "In the last sections we have seen some of the main differences between Pandas and hvPlot APIs, and how you could adapt your code for hvPlot's purposes. In this section we'll see an example of how hvPlot extends Pandas' original API.\n",
    "\n",
    "You can  use the `groupby` keyword to build interactive widgets that explore different dimensions of your data. Here, we group the dataset by both `'island'` and `'sex'`, and interactive widgets let you navigate through each combination. Click on the widgets to reveal how these factors influence the visualization of the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0afea115",
   "metadata": {
    "tags": [
     "remove-output"
    ]
   },
   "outputs": [],
   "source": [
    "df.hvplot.scatter(x='bill_length_mm', y='bill_depth_mm', groupby=['island', 'sex'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cbb535b7-310f-44d9-bf27-f8807f183921",
   "metadata": {
    "tags": [
     "remove-input"
    ]
   },
   "outputs": [],
   "source": [
    "# Code hidden on the website\n",
    "df.hvplot.scatter(x='bill_length_mm', y='bill_depth_mm', groupby=['island', 'sex'], dynamic=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "78a9e737",
   "metadata": {},
   "source": [
    "## Saving a plot\n",
    "\n",
    "Saving a Bokeh plot generated with hvPlot can be done directly from your browser by clicking on the `Save` button in the toolbar, entering a name and pressing `OK`. The plot is saved as a PNG on your machine."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cea2b1d2",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot = df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', width=350, height=250)\n",
    "plot"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36348d4f",
   "metadata": {},
   "source": [
    "Alternatively, you can save a plot using the {func}`hvplot.save` utility, passing it a plot object, a file name and optional arguments. Below, we show how to save this plot as an HTML file with all of the required resources inlined (so it can be viewed offline)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cf7bf55c",
   "metadata": {},
   "outputs": [],
   "source": [
    "hvplot.save(plot, 'my_plot.html', resources='inline')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8f6e7e40",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "# Hidden on the website\n",
    "import pathlib\n",
    "\n",
    "pathlib.Path('my_plot.html').unlink()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "780d6167",
   "metadata": {},
   "source": [
    "## Matplotlib plots with hvPlot\n",
    "\n",
    "hvPlot can also generate Matplotlib plots."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ca4e01d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "hvplot.extension('matplotlib')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b98a0a1f",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot = df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', width=350, height=250)\n",
    "plot"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4383930a",
   "metadata": {},
   "source": [
    ":::{tip}\n",
    "Saving programmatically a Bokeh plot as a static file requires you to install some browser-based technology in your environment, which is doable but not the most practical approach. Instead, with the Matplotlib extension saving plots as PNG or SVG is straightforward.\n",
    ":::"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bac72309",
   "metadata": {},
   "outputs": [],
   "source": [
    "hvplot.save(plot, 'my_plot.png')\n",
    "hvplot.save(plot, 'my_plot.svg')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8f633520",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "# Hidden on the website\n",
    "pathlib.Path('my_plot.png').unlink()\n",
    "pathlib.Path('my_plot.svg').unlink()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "078a5708-5399-4092-b24f-28777085ce59",
   "metadata": {},
   "source": [
    "## Next steps\n",
    "\n",
    "- Visit the [Pandas API compatibility](../ref/api_compatibility/pandas/index.ipynb) reference section.\n",
    "- Find out more about hvPlot by exploring this website!"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
